
Continuous Latent Variables

Principal Component Analysis

Introduction

Principal component analysis (PCA) is widely used for applications such as dimensionality reduction, lossy data compression, feature extraction, and data visualization. It is also known as the Karhunen-Loève transform. There are two commonly used definitions of PCA that give rise to the same algorithm. PCA can be defined as the orthogonal projection of the data onto a lower-dimensional linear space, known as the principal subspace, such that the variance of the projected data is maximized. Equivalently, it can be defined as the linear projection that minimizes the average projection cost, defined as the mean squared distance between the data points and their projections.
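As a concrete illustration of PCA as lossy compression, the following is a minimal sketch using NumPy's SVD of the centered data (the data and variable names are illustrative, not from the text):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))           # N = 200 observations in D = 5 dimensions
X_centered = X - X.mean(axis=0)

# SVD of the centered data; the rows of Vt are the principal directions.
U, s, Vt = np.linalg.svd(X_centered, full_matrices=False)

M = 2                                   # target dimensionality M < D
Z = X_centered @ Vt[:M].T               # projected (compressed) data, shape (200, 2)
X_approx = Z @ Vt[:M] + X.mean(axis=0)  # lossy reconstruction in the original space
```

Keeping all D directions reproduces the data exactly; truncating to M < D discards the directions of least variance, which is what makes the compression lossy.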

Maximum variance formulation

Consider a data set of observations $\{x_n\}$ where $n = 1, \dots, N$, and $x_n$ is a Euclidean variable with dimensionality $D$. Our goal is to project the data onto a space having dimensionality $M < D$ while maximizing the variance of the projected data. We define the direction of this space using a $D$-dimensional unit vector $u_1$ satisfying $u_1^T u_1 = 1$. Each data point $x_n$ is then projected onto a scalar value $u_1^T x_n$. The mean of the projected data is $u_1^T \bar{x}$, where $\bar{x}$ is the sample set mean given by
$$\bar{x} = \frac{1}{N} \sum_{n=1}^{N} x_n$$
and the variance of the projected data is given by
$$\frac{1}{N} \sum_{n=1}^{N} \left\{ u_1^T x_n - u_1^T \bar{x} \right\}^2 = \frac{1}{N} \sum_{n=1}^{N} \left\{ u_1^T (x_n - \bar{x}) \right\}^2 = \frac{1}{N} \sum_{n=1}^{N} u_1^T (x_n - \bar{x})(x_n - \bar{x})^T u_1 = u_1^T S u_1$$
where $S$ is the data covariance matrix defined by
$$S = \frac{1}{N} \sum_{n=1}^{N} (x_n - \bar{x})(x_n - \bar{x})^T.$$
We now maximize the projected variance $u_1^T S u_1$ with respect to $u_1$. This must be a constrained maximization to prevent $\|u_1\| \to \infty$. The appropriate constraint comes from the normalization condition $u_1^T u_1 = 1$. To enforce this constraint, we introduce a Lagrange multiplier that we shall denote by $\lambda_1$, and then make an unconstrained maximization of
$$u_1^T S u_1 + \lambda_1 \left( 1 - u_1^T u_1 \right).$$
By setting the derivative with respect to $u_1$ equal to zero, we see that this quantity will have a stationary point when
$$S u_1 = \lambda_1 u_1,$$
which says that $u_1$ must be an eigenvector of $S$. If we left-multiply by $u_1^T$ and make use of $u_1^T u_1 = 1$, we see that the variance is given by
$$u_1^T S u_1 = \lambda_1,$$
and so the variance will be a maximum when we set $u_1$ equal to the eigenvector having the largest eigenvalue $\lambda_1$. This eigenvector is known as the first principal component.

Minimum-error formulation

Applications of PCA

PCA for high-dimensional data

Probabilistic PCA

Kernel PCA

Nonlinear Latent Variable Models